DETAILED ACTION
Response to Amendment 

1. Applicant's arguments filed 5/15/2007 have been fully considered but they are
not persuasive. Saylor et al. fully anticipate the limitation regarding "wherein each of 
said interactive voice response applications includes an executable component for 
execution by said hosting system, said executable component comprising at least one 
of an executable file, a Java Bean, a Corba-component, a compiled software module, 
and a pre-compiled software module" (VPAGE Database 50 in figure 3, voice response
application includes TML, XML, VoiceXML, WML, and others in col. 21, lines 10-45; 
these are executable components). 

2. In response to applicant's argument regarding "the references fail to
teach the use of any speaker dependent models that are updated without the use of a
training phase, as Maes clearly teaches that such a training phase is necessary", the 
claim language does not specifically claim "speaker dependent models that are updated 
without the use of a training phase". Furthermore, the phrase "user-specific speech 
models adapted to specific users" in the claim can be interpreted as speech models
belonging to particular users that are used for those users upon identification.

Claim Rejections - 35 USC § 101 

3. 35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

4. Claim 51 is rejected under 35 U.S.C. 101 because the claimed invention is 
directed to non-statutory subject matter. 

5. Claim 51 is drawn to a "program" per se as recited in the preamble (the
specification does not mention the term computer-readable medium containing
computer program product) and as such is non-statutory subject matter. See MPEP §
2106.IV.B.1.a. Data structures not claimed as embodied in computer readable media
are descriptive material per se and are not statutory because they are not capable of
causing functional change in the computer. See, e.g., Warmerdam, 33 F.3d at 1361, 31
USPQ2d at 1760 (claim to a data structure per se held nonstatutory). Such claimed
data structures do not define any structural and functional interrelationships between 
the data structure and other claimed aspects of the invention, which permit the data 
structure's functionality to be realized. In contrast, a claimed computer readable 
medium encoded with a data structure defines structural and functional 
interrelationships between the data structure and the computer software and hardware 
components which permit the data structure's functionality to be realized, and is thus 
statutory. Similarly, computer programs claimed as computer listings per se, i.e., the
descriptions or expressions of the programs, are not physical "things." They are neither
computer components nor statutory processes, as they are not "acts" being performed.
Such claimed computer programs do not define any structural and functional
interrelationships between the computer program and other claimed elements of a
computer, which permit the computer program's functionality to be realized. 

Claim Rejections - 35 USC § 103 

6. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

7. Claims 1-5, 7-12, 15, 18-24, 26-28, 30-39, 42-50, 52, 53-59, 51(30)-51(39), and 
51(42)-51(50) are rejected under 35 U.S.C. 103(a) as being unpatentable over Saylor et 
al. (US 6792086) in view of Maes (US 6088669). 

8. Regarding claims 1, 30, 53-59 and 51(30), Saylor et al. disclose a voice portal 
hosting system, intended to be connected to a first voice telecommunication network in 
order for a plurality of users in said network to establish a connection with the system
using voice equipment, said system comprising: 

a memory in which a plurality of interactive voice response applications providing 
interactive response functionality is stored, each of said applications including an 
executable component for execution by said hosting system (VPAGE Database 50 in
figure 3, voice response application includes TML, XML, VoiceXML, WML, and others in 
col. 21, lines 10-45); 

a common speech recognition module (voice to text system 62 in figure 3);

a user identification module for identifying a user (col. 7, line 58 to col. 8, line 15); 

uploading means for independently uploading said plurality of interactive voice 
response applications, to said system, by a plurality of independent value-added service 
providers (col. 20, line 64 to col. 21, line 45 and/or referring to figure 3, content provider
70 provides information to VPAGE Server 22), and wherein the identified user interacts
with one or more of said interactive voice response applications (col. 8, lines 1-38,
identified user is allowed to access voice services), and wherein each of said interactive
voice response applications includes an executable component for execution by said 
hosting system, said executable component comprising at least one of an executable 
file, a Java Bean, a Corba-component, a compiled software module, and a pre-compiled 
software module (VPAGE Database 50 in figure 3, voice response application includes 
TML, XML, VoiceXML, WML, and others in col. 21, lines 10-45; these are executable 
components). 

Saylor et al. fail to specifically disclose means for storing a plurality of user-
specific speech models adapted to specific users for use by the common speech
recognition module; means for retrieving the user-specific speech model of the identified 
user from said plurality of models; and wherein said one or more interactive voice 
response applications utilize said retrieved user-specific speech models via said 
common speech recognition module for recognizing speech of the identified user. 

However, Maes teaches means for storing a plurality of user-specific speech 
models adapted to specific users for use by the common speech recognition module 
(speaker-dependent HMM models 440 in figure 1); means for retrieving the user-
specific speech model of the identified user from said plurality of models (the operation
of figure 1, after the speaker is identified by speaker identification module 410, the HMM
models of the identified speaker are retrieved and loaded into speech recognizer 120);
and wherein said one or more interactive voice response applications utilize said
retrieved user-specific speech models via said common speech recognition module for
recognizing speech of the identified user (the operation of figure 1, after the speaker is
identified by speaker identification module 410, the HMM models of the identified
speaker are retrieved and loaded into speech recognizer 120).

Because Saylor et al. and Maes are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the time of
invention to modify Saylor et al. by incorporating the teaching of Maes in order to
improve speech recognition accuracy by using user-specific speech models.

9. Regarding claims 50 and 51(50), Saylor et al. disclose a method for allowing
each of a plurality of independent value-added service providers to set up interactive
voice response applications, each including an executable component for execution by a
voice portal hosting system commonly used by said plurality of value-added service
providers and which can be used by a plurality of users (the operation of figure 1, 
multiple users access voice services at the server having a common speech recognizer, 
and independent service providers connected to the server providing voice response 
applications), said method comprising the steps of: 

independently uploading, through a second telecommunication network, said
interactive voice response applications to said system for providing interactive voice
response functionality (col. 20, line 64 to col. 21, line 45 and/or referring to figure 3,
content provider 70 provides information to VPAGE Server 22);

identifying a user calling said system (col. 7, line 58 to col. 8, line 15); 

retrieving speech models for the speech recognizer (voice to text system 62 in 
figure 3, uses system speech recognition models to recognize speech);

executing one or more of said voice response applications in response to the 
user calling said system (VPAGE Database 50 in figure 3, voice response application 
includes TML, XML, VoiceXML, WML, and others in col. 21, lines 10-45), said executing 
including interacting with said user via said common speech module using said 
retrieved speech model for recognizing the speech of the user (voice to text system 62 
in figure 3, uses system speech recognition models to recognize speech), wherein each 
of said interactive voice response applications includes an executable component for 
execution by said hosting system, said executable component comprising at least one 
of an executable file, a Java Bean, a Corba-component, a compiled software module, 
and a pre-compiled software module (VPAGE Database 50 in figure 3, voice response 
application includes TML, XML, VoiceXML, WML, and others in col. 21, lines 10-45; 
these are executable components). 

Saylor et al. fail to specifically disclose means for storing a plurality of user- 
specific speech models adapted to specific users for use by the common speech 
recognition module; means for retrieving the user-specific speech model of the identified 
user from said plurality of models; and wherein said one or more interactive voice
response applications utilize said retrieved user-specific speech models via said 
common speech recognition module for recognizing speech of the identified user. 

However, Maes teaches means for storing a plurality of user-specific speech 
models adapted to specific users for use by the common speech recognition module 
(speaker-dependent HMM models 440 in figure 1); means for retrieving the user- 
specific speech model of the identified user from said plurality of models (the operation 
of figure 1, after the speaker is identified by speaker identification module 410, the HMM
models of the identified speaker are retrieved and loaded into speech recognizer 120);
and wherein said one or more interactive voice response applications utilize said
retrieved user-specific speech models via said common speech recognition module for
recognizing speech of the identified user (the operation of figure 1, after the speaker is
identified by speaker identification module 410, the HMM models of the identified
speaker are retrieved and loaded into speech recognizer 120); and wherein said common
speech models are adapted during each dialog between said users and any of said 
interactive voice response applications (col. 4, lines 42-57, speech model adaptation). 

Because Saylor et al. and Maes are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the time of
invention to modify Saylor et al. by incorporating the teaching of Maes in order to 
improve speech recognition accuracy by using user-specific speech models. 

10. Regarding claim 52, Saylor et al. disclose a voice portal hosting system allowing
a plurality of users to establish a connection with said system using voice equipment for 
interacting with one or more of a plurality of service providers, said system comprising: 

means for independently uploading a plurality of interactive voice response 
applications from said service providers, to said system, via a communication channel
(col. 20, line 64 to col. 21, line 45 and/or referring to figure 3, content provider 70
provides information to VPAGE Server 22), each of said voice response applications for 
providing interactive voice response functionality for a corresponding one of said service 
providers when executed by said hosting system (VPAGE Database 50 in figure 3, 
voice response application includes TML, XML, VoiceXML, WML, and others in col. 21, 
lines 10-45); 

means for storing said plurality of interactive voice response applications 
(VPAGE Database 50 in figure 3, voice response application includes TML, XML, 
VoiceXML, WML, and others in col. 21, lines 10-45); 

a common speech recognition module (voice to text system 62 in figure 3); 

means for storing a plurality of speech models adapted to specific users for use 
by the common speech recognition module (voice to text system 62 in figure 3, uses 
system speech recognition models to recognize speech); 

a user identification module for identifying a user calling said system via another 
communication channel (col. 7, line 58 to col. 8, line 15); 

means for retrieving the speech model of the identified user from said plurality of
models (voice to text system 62 in figure 3, uses system speech recognition models to 
recognize speech), wherein 

the identified user interacts with one or more of said interactive voice response 
applications (col. 8, lines 1-38, identified user is allowed to access voice services); and

wherein each of said interactive voice response applications includes an 
executable component for execution by said hosting system, said executable 
component comprising at least one of an executable file, a Java Bean, a Corba-
component, a compiled software module, and a pre-compiled software module (VPAGE
Database 50 in figure 3, voice response application includes TML, XML, VoiceXML, 
WML, and others in col. 21, lines 10-45; these are executable components). 

Saylor et al. fail to specifically disclose means for storing a plurality of user- 
specific speech models adapted to specific users for use by the common speech 
recognition module; means for retrieving the user-specific speech model of the identified 
user from said plurality of models; said one or more interactive voice response 
applications utilize said retrieved user-specific speech model via said common speech 
recognition module for recognizing speech of the identified user, and further wherein 
said common speech models are adaptable during dialogue between said users and 
any of said interactive voice response applications. 

Application/Control Number: 09/835,237
Art Unit: 2626

However, Maes teaches means for storing a plurality of user-specific speech
models adapted to specific users for use by the common speech recognition module
(speaker-dependent HMM models 440 in figure 1); means for retrieving the user-
specific speech model of the identified user from said plurality of models (the operation
of figure 1, after the speaker is identified by speaker identification module 410, the HMM
models of the identified speaker are retrieved and loaded into speech recognizer 120);
said one or more interactive voice response applications utilize said retrieved user- 
specific speech model via said common speech recognition module for recognizing 
speech of the identified user (the operation of figure 1, after the speaker is identified by 
speaker identification module 410, the HMM models of the identified speaker are
retrieved and loaded into speech recognizer 120), and further wherein said common
speech models are adaptable during dialogue between said users and any of said 
interactive voice response applications (col. 4, lines 42-57, speech model adaptation). 

Because Saylor et al. and Maes are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the time of
invention to modify Saylor et al. by incorporating the teaching of Maes in order to 
improve speech recognition accuracy by using user-specific speech models. 

11. Regarding claims 2-5, 31-35, and 51(31)-51(35), Saylor et al. further disclose the
voice portal hosting system, wherein said common speech recognition module 
comprises a common user profile database (col. 7, line 58 to col. 8, line 15), and 
wherein said common user profile database includes user preferences (col. 7, line 58 to 
col. 8, line 15), and wherein said user preferences include a delivery address for goods 
and/or services ordered with said value-added service providers (col. 7, line 58 to col. 8, 
line 15), wherein said user preferences include a billing address and/or preferences for 
goods and services ordered with said value-added service providers (col. 7, line 58 to
col. 8, line 15), wherein said common speech recognition module uses user-specific 
speech models (col. 7, line 58 to col. 8, line 15, voice print authentication). 

12. Regarding claims 20-24, 26-28, 44-49, and 51(44)-51(49), Saylor et al. further
disclose the voice portal hosting system, wherein at least a plurality of said interactive 
voice response applications use a common billing module and a common clearing 
center for dispatching the collected amounts to said value-added service providers 
(Billing Module 46 in figure 2), wherein said common billing module allows for the billing 
of transactions between said users and said value-added service providers on a 
common bill prepared by the operator of said voice portal hosting system (Billing 
Module 46 in figure 2), and wherein at least a plurality of said users have a deposit
account on said voice portal hosting system which can be used for transactions with a 
plurality of said value-added service providers (Billing Module 46 in figure 2), wherein at 
least a plurality of said interactive voice response applications use a user authentication 
module based on an electronic signature and/or on biometric parameters of said users 
(col. 7, line 58 to col. 8, line 15, voice print authentication), wherein said second 
telecommunication network is a TCP/IP network (col. 14, lines 5-25 and/or referring to
network 20 in figures 1-3), wherein at least some of said interactive voice response 
applications are described with VoiceXML documents (col. 21, lines 10-45), wherein at 
least one free interactive voice response application is made available by the operator 
of the system (col. 21, lines 10-45), and wherein said free interactive voice response 
application includes a free directory assistance service (col. 36, line 53 to col. 37, line
8). 

13. Regarding claims 7-8, 36, and 51(36), Saylor et al. fail to specifically disclose the
voice portal hosting system, wherein said common speech recognition module uses 
user-specific speech models, means for adapting said common speech models 
associated to a user during each dialogue between said user and each of said 
interactive voice response applications, and wherein said means for adapting said 
common speech models uses recorded users' speech samples for adapting said 
common speech models off-line. 

However, Maes teaches speech recognition module using user-specific speech 
models (speaker-dependent HMM models 440 in figure 1), means for adapting said 
common speech models associated to a user during each dialogue between said user 
and each of said interactive voice response applications (col. 4, lines 42-57, speech 
model adaptation), and wherein said means for adapting said common speech models 
uses recorded users' speech samples for adapting said common speech models off-line 
(storing as speaker-dependent models 440 in figure 1). 

Because Saylor et al. and Maes are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the time of
invention to modify Saylor et al. by incorporating the teaching of Maes in order to 
improve speech recognition accuracy. 

14. Regarding claims 9-10, Saylor et al. fail to specifically disclose the voice portal
hosting system of claim 1, wherein said common speech recognition module uses 
Hidden Markov Models, and further comprising a Hidden Markov Models adaptation 
module for adapting said models to said user, and wherein said Hidden Markov 
Models adaptation module allows for an incremental adaptation of said models. 
However, Maes teaches a common speech recognition module that uses Hidden Markov
Models, and further comprising a Hidden Markov Models adaptation module for 
adapting said models to said user (HMM models 440 in figure 1 and adapted in col. 4, 
lines 42-57), and wherein said Hidden Markov Models adaptation module allows for an 
incremental adaptation of said models (col. 4, lines 42-57).

Because Saylor et al. and Maes are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the time of
invention to modify Saylor et al. by incorporating the teaching of Maes in order to 
improve speech recognition accuracy. 

15. Regarding claims 11-12, 37-38, and 51(37)-51(38), Saylor et al. fail to specifically
disclose the voice portal hosting system, wherein said common speech recognition 
module uses user-specific language models, and means for adapting said common 
language models associated to a user during each dialogue between said user and 
each of said interactive voice response applications. However, Maes teaches a 
common speech recognition module that uses user-specific language models (speaker-
dependent HMM models 440 in figure 1), and means for adapting said common
language models associated to a user during each dialogue between said user and
each of said interactive voice response applications (col. 4, lines 42-57). 

Because Saylor et al. and Maes are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the time of
invention to modify Saylor et al. by incorporating the teaching of Maes in order to 
improve speech recognition accuracy. 

16. Regarding claims 15, 18-19, 39, 42-43, 51(39), and 51(42)-51(43), Saylor et al. 
fail to specifically disclose the voice portal hosting system, wherein at least a plurality of 
said interactive voice response applications use a common user identification module 
run on said system, wherein said user identification module uses a voice-based user 
identification module, wherein said common speech recognition module uses a 
speaker-dependent speech recognition algorithm, and wherein said speaker is identified
by said common user identification module. 

However, Maes further teaches that at least a plurality of said interactive voice 
response applications use a common user identification module run on said system, 
wherein said user identification module uses a voice-based user identification module, 
wherein said common speech recognition module uses a speaker-dependent speech
recognition algorithm, and wherein said speaker is identified by said common user 
identification module (referring to the operation of figure 1, specifically speaker-
dependent codebooks 430 in figure 1 for use by speaker identification module 410 in
figure 1).

Because Saylor et al. and Maes are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the time of
invention to modify Saylor et al. by incorporating the teaching of Maes in order to
identify the user and the user's profile for use by the speech recognition module to improve
speaker recognition accuracy by using speaker-dependent codebooks trained by
users in advance.

17. Claims 13-14 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Saylor et al. (US 6792086) in view of Maes (US 6088669), as applied to claim 1, and
further in view of Beyda et al. (US 6487277).

18. Regarding claims 13-14, Saylor et al. fail to specifically disclose a voice portal 
hosting system of claim 1, wherein said common speech recognition module uses 
selections previously made by said users, and wherein said selections previously made 
by said users are stored in said voice portal hosting system for improving the 
arborescence of the menus. However, Beyda et al. teach that the common speech recognition
module uses selections previously made by said users, and wherein said selections 
previously made by said users are stored in said voice portal hosting system for 
improving the arborescence of the menus (see abstract). 

Because Saylor et al. and Beyda et al. are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the
time of invention to modify Saylor et al. by incorporating the teaching of Beyda et al. in
order to tailor the presentation order to the needs of each individual user to improve
the system's efficiency.

19. Claims 16-17, 40-41, 51(40-41) are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Saylor et al. (US 6792086) in view of Maes (US 6088669), as applied 
to claims 15 and 39, respectively, and further in view of Woods et al. (US 6510417). 

20. Regarding claims 16-17, 40-41, and 51(40-41), Saylor et al. fail to specifically 
disclose that the user identification module uses an identification of the equipment used 
by said user in said first telecommunication network, and being operated by a telecom 
operator of said first telecommunication network, wherein said user identification 
module uses an identification of the equipment used by said user in said first 
telecommunication network even when said identification is not available for the other 
B-subscribers of said first telecommunication network. However, Woods et al. teach 
that the user identification module uses an identification of the equipment used by said 
user in said first telecommunication network, and being operated by a telecom operator 
of said first telecommunication network, wherein said user identification module uses an 
identification of the equipment used by said user in said first telecommunication network 
even when said identification is not available for the other B-subscribers of said first 
telecommunication network (col. 24, lines 39-41). 

Because Saylor et al. and Woods et al. are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at the
time of invention to modify Saylor et al. by incorporating the teaching of Woods et al. in
order to allow the system to automatically authenticate users based on their phone
numbers by using a caller-ID procedure.



Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Huyen X. Vo whose telephone number is 571-272-7631.
The examiner can normally be reached on M-F, 9-5:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick Edouard, can be reached on 571-272-7603. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 